
    Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases

    We investigate the optimality for model selection of the so-called slope heuristics, V-fold cross-validation and V-fold penalization in a heteroscedastic regression context with random design. We consider a new class of linear models that we call strongly localized bases and that generalize histograms, piecewise polynomials and compactly supported wavelets. We derive sharp oracle inequalities that prove the asymptotic optimality of the slope heuristics (when the optimal penalty shape is known) and of V-fold penalization. Furthermore, V-fold cross-validation appears to be suboptimal for a fixed value of V, since it asymptotically recovers the oracle learned from a sample of size equal to a fraction 1 - V^{-1} of the original amount of data. Our results are based on genuine concentration inequalities for the true and empirical excess risks that are of independent interest. Our experiments show the good behavior of the slope heuristics for the selection of linear wavelet models. Furthermore, V-fold cross-validation and V-fold penalization have comparable efficiency.
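    As an illustration of how such a penalty can be calibrated in practice, here is a minimal Python sketch of one common variant of the slope heuristics: fit the slope of the empirical risk against the model dimension over the largest models, read off the minimal penalty constant, and double it. The penalty shape c * D_m / n, the choice of the largest half of the models for the fit, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def slope_heuristics_select(dims, emp_risks, n):
    """Select a model by the slope heuristics with penalty shape c * D_m / n.

    dims      : dimensions D_m of the candidate models
    emp_risks : empirical risks of the least-squares estimators on each model
    n         : sample size
    """
    dims = np.asarray(dims, dtype=float)
    emp_risks = np.asarray(emp_risks, dtype=float)

    # On the largest models, the empirical risk is expected to decrease
    # roughly linearly in D_m / n, with slope -c_min (the minimal penalty).
    order = np.argsort(dims)
    big = order[len(order) // 2:]
    slope, _ = np.polyfit(dims[big] / n, emp_risks[big], 1)
    c_min = -slope

    # Slope heuristics: the (near-)optimal penalty is twice the minimal one.
    crit = emp_risks + 2.0 * c_min * dims / n
    return int(np.argmin(crit))
```

    V-fold penalization would replace the fitted penalty 2 * c_min * D_m / n by a resampling-based estimate of the ideal penalty; the abstract's point is that both reach the oracle asymptotically, while V-fold cross-validation with fixed V behaves like an oracle for a sample of size (1 - V^{-1}) n.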

    Nonasymptotic quasi-optimality of AIC and the slope heuristics in maximum likelihood estimation of density using histogram models

    We consider nonparametric maximum likelihood estimation of density using linear histogram models. More precisely, we investigate the optimality of model selection procedures via penalization, when the number of models is polynomial in the number of data. It turns out that the slope heuristics, first formulated by Birgé and Massart [10], is satisfied under rather mild conditions on the density to be estimated and on the structure of the considered partitions, and that the minimal penalty is equivalent to half of the AIC penalty.
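    To make the AIC claim concrete, here is a minimal Python sketch of penalized maximum likelihood selection among regular histogram models on [0, 1]; the support, the regular partitions, and the convention D_m = number of bins are assumptions for illustration. In this log-likelihood scale AIC subtracts D_m, i.e. exactly twice the minimal penalty D_m / 2 identified in the abstract.

```python
import numpy as np

def histogram_loglik(data, n_bins):
    """Maximized log-likelihood of the regular histogram model on [0, 1]:
    the MLE density is constant on each bin, equal to count / (n * width)."""
    n = len(data)
    counts, edges = np.histogram(data, bins=n_bins, range=(0.0, 1.0))
    widths = np.diff(edges)
    pos = counts > 0  # empty bins contribute nothing to the log-likelihood
    return np.sum(counts[pos] * np.log(counts[pos] / (n * widths[pos])))

def select_bins_aic(data, candidate_bins):
    """AIC: maximize log-likelihood(m) - D_m over the candidate models m."""
    scores = [histogram_loglik(data, D) - D for D in candidate_bins]
    return candidate_bins[int(np.argmax(scores))]
```

    With the minimal penalty D_m / 2 alone, selection overfits; the slope-heuristics prescription of doubling it recovers the AIC penalty D_m, which is the sense in which AIC is quasi-optimal here.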

    Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression

    We consider the estimation of a bounded regression function with nonparametric heteroscedastic noise. We are interested in the true and empirical excess risks of the least-squares estimator on a finite-dimensional vector space. For these quantities, we give upper and lower bounds in probability that are optimal at first order. Moreover, these bounds show the equivalence between the true and empirical excess risks when, among other things, the least-squares estimator is consistent in sup-norm toward the projection of the regression function onto the considered model. Consistency in sup-norm is then proved for suitable histogram models and for more general models of piecewise polynomials endowed with a localized basis structure.
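    For concreteness, here is a minimal Python sketch of the least-squares estimator on a histogram model, assuming a regular partition of [0, 1] and design points in that interval (both assumptions): the fit is piecewise constant, equal on each bin to the mean of the responses falling in it, and its empirical risk is the mean squared residual on the sample.

```python
import numpy as np

def histogram_lse(x, y, n_bins):
    """Least-squares estimator on the regular histogram model on [0, 1]:
    piecewise constant, equal on each bin to the mean of the y's there."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    means = np.zeros(n_bins)
    for j in range(n_bins):
        in_bin = idx == j
        if in_bin.any():
            means[j] = y[in_bin].mean()

    def predict(t):
        k = np.clip(np.digitize(t, edges) - 1, 0, n_bins - 1)
        return means[k]

    return predict

# Toy usage with heteroscedastic noise (illustrative choices throughout).
rng = np.random.default_rng(0)
x = rng.uniform(size=500)
y = np.sin(2 * np.pi * x) + 0.1 * (1.0 + x) * rng.normal(size=500)
fit = histogram_lse(x, y, n_bins=16)
empirical_risk = np.mean((y - fit(x)) ** 2)
```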